<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>

<head>
  <title>Speak</title>
  <meta name="GENERATOR" content="Quanta Plus">
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<hr>
<h2>VOICES</h2>
<hr>
<h3>Voice Files Provided</h3>
A number of voice files are provided in the <strong>speak-data/voices</strong> directory.
Select one of these with the <strong>-v &lt;voice name&gt;</strong> parameter to the
speak command.
<p>
<dl>
<dt>
<strong>default</strong><br>
<dd>   This voice is used if none is specified in the speak command.
<p>
<dt>
<strong>en</strong><br>
<dd>   The standard default English voice.
<p>
<dt>
<strong>en-b<br>
en-c<br>
en-d</strong><br>
<dd>   Alternative English voices.  These can be considered caricatures of
   various British accents: Northern, Upper-Class, and West Midlands,
   respectively.
<p>
<dt>
<strong>en-f, en-fb, en-fc, en-fd</strong><br>
<dd>   Female versions of the above. Not genuine female voices, just variants
   with different pitch and formant parameters.
<p>
<dt>
<strong>en1</strong><br>
<dd>   A variation of "en" with echo.<br>
   Adding a little echo can give a clearer or more interesting sound.
<p>
<dt>
<strong>en2</strong> to <strong>en8</strong><br>
<dd>   Variations of "en" with different tonal quality.<br>
<p>
<dt>
<strong>esperanto</strong><br>
<dd>   An illustration of a different language.  Esperanto has simple
   pronunciation rules, and a different stress pattern from English.  I
   don't know how Esperanto is supposed to sound, other than what I've
   read in an introduction.  There are some Esperanto texts on
   <a href="http://www.gutenberg.org">www.gutenberg.org</a>.  Text can be either in the Latin3 alphabet, or
   else use the Latin1 convention of using two-letter combinations (cx,
   gx, etc).
<p>
<dt>
<strong>german</strong><br>
<dd>A very cursory attempt at German.  This is not a serious
   implementation: it has only very simple and inadequate pronunciation
   rules, which give many wrong pronunciations and wrong stress placement.
   I have only a small knowledge of German, from school many years ago,
   but I can at least tell that the prosody needs considerable adjustment.
   Also very noticeable is that the post-vocalic R sound (an R which is not
   followed by a vowel, which doesn't exist in my own speech) sounds very
   odd, so that will need some work.
</dl>
<hr>
<h3>Contents of Voice Files</h3>
(subject to change)
<p>
<dl>
<dt>
<strong>language &nbsp;&lt;name&gt;</strong><br>
<dd>This parameter should appear first.  It selects default behaviour
   and characteristics for the language, and sets default values for
   "phonemes", "dictionary" and other parameters.
   If omitted, "english" is assumed.
<p>
<dt>
<strong>phonemes &nbsp;&lt;name&gt;</strong><br>
<dd>Specifies which set of phonemes to use from those contained in the
   phontab, phonindex, and phondata data files.
<p>
   Different voices of the same language can use different phoneme sets,
   to give different accents.  A default "phonemes" value is set by the
   "language" parameter.
<p>
<dt>
<strong>dictionary &nbsp;&lt;name&gt;</strong><br>
<dd>   Specifies which pair of dictionary files to use.  For example, "english"
   indicates that <em>speak-data/english_1</em> and <em>speak-data/english_2</em> should
   be used to translate from words to phonemes.  This parameter is usually
   not needed, as the "language" parameter sets it by default.
<p>
<dt>
<strong>pitch &nbsp;&lt;base&gt; &lt;range&gt;</strong><br>
<dd>   Two integer values.
   The first gives the base pitch of the voice (value in Hz).
   The second controls the range of pitches used by the voice.  Setting
   it equal to the base pitch will give a monotone.
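<p>
   As a rough illustration (the numbers are arbitrary, chosen only to show
   the two cases described above, and the second line is an alternative to
   the first):
<pre>
      pitch  80 120      // base pitch 80 Hz; the second value differs, so the pitch can vary
      pitch  80 80       // second value equals the base pitch: monotone
</pre>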
<p>
<dt>
<strong>formant &nbsp;&lt;number&gt; &lt;frequency&gt; &lt;strength&gt; &lt;width&gt;</strong><br>
<dd>   Systematically adjusts the frequency, strength, and width of the
   resonance peaks of the voice.  Values are percentages of the
   default values.  Changing these affects the tone/quality of the voice.
<ul>
   <li>Formants 1,2,3 are the standard three formants which define vowels.</li>
   <li>Formant 0 is used to give a low frequency component to the sounds, of
      frequency lower than F1.</li>
   <li>Formants 4,5 are higher than F3.  They affect the quality of the voice.</li>
   <li>Formants 6,7,8 are weak, high frequency, additions to vowels to give
      a clearer sound.</li>
</ul>
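   A minimal sketch of the syntax, assuming that a value of 100 leaves the
   corresponding default unchanged (since values are percentages of the
   defaults).  The numbers are illustrative only:
<pre>
      formant  1  110 100 100    // formant 1: frequency at 110% of default, strength and width unchanged
      formant  4  100  80 100    // formant 4: strength at 80% of default, frequency and width unchanged
</pre>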
<p>
<dt>
<strong>echo &nbsp;&lt;delay&gt; &lt;amplitude&gt;</strong><br>
<dd>   Parameter 1 gives the delay in ms (0 to 250 ms).<br>
   Parameter 2 gives the echo amplitude (0 to 100).<br>

   Adding some echo can give a clearer or more interesting sound,
   especially when listening through a domestic stereo sound system,
   rather than small computer speakers.
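<p>
   For example, the following line (values picked arbitrarily from within
   the ranges given above) would request an echo delayed by 100 ms:
<pre>
      echo  100 30       // delay 100 ms, amplitude 30 on the 0 to 100 scale
</pre>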
<p>
<dt>
<strong>flutter &nbsp;&lt;value&gt;</strong><br>
<dd>   Default value: 2.<br>

   Adds pitch fluctuations to give a wavering or older-sounding voice.
   A large value (eg. 20) makes the voice sound "croaky".
<p>
<dt>
<strong>roughness &nbsp;&lt;value&gt;</strong><br>
<dd>   Default value: 2.  Range: 0 to 7.<br>

   Reduces the amplitude of alternate waveform cycles to make the voice sound creaky.
<p>
<dt>
<strong>words &nbsp;&lt;value&gt;</strong><br>
<dd>   Indicates to what extent words are separated from each other, or run
   together.  A default value is set by "language".
<ul>
   <li>0 &nbsp; run words together where appropriate</li>
   <li>1 &nbsp; don't merge phonemes between words</li>
   <li>2 &nbsp; as (1) but also ensure a short pause before stops and fricatives</li> 
   <li>3 &nbsp; short pause between words</li>
   <li>4 &nbsp; slightly longer pause between words</li>
</ul>
<p>
<dt>
<strong>replace &nbsp;&lt;phoneme&gt; &lt;replacement phoneme&gt;</strong><br>
<dd>   Replaces a phoneme with another wherever it occurs,
   e.g.
<pre>
      replace  h  NULL      // drops h's
      replace  V  U         // replaces vowel in 'strut' by that in 'foot'
                            // as occurs in northern British English
</pre>
   The phoneme mnemonics are listed in <A href="phonemes.html">phonemes.html</A>.
</dd>
<p>
<dt>
<strong>replaceWE &nbsp;&lt;phoneme&gt; &lt;replacement phoneme&gt;</strong><br>
<dd>   Replaces a phoneme only when it occurs at the end of a word,
   e.g.
<pre>
      replaceWE  N  n         // change 'fishing' to 'fishin' etc.
</pre>
<p>
<dt>
<strong>stressLength &nbsp;&lt;8 values&gt;</strong><br>
<dd>   Eight integer parameters.  These control the relative lengths of
   stressed and unstressed syllables.
<ul>
<li>      0 &nbsp; unstressed
</li><li>      1 &nbsp; diminished (weaker than 0, used within multisyllabic words)
</li><li>      2 &nbsp; secondary stress
</li><li>      3 &nbsp; words marked as "unstressed" in the dictionary
</li><li>      4 &nbsp; &nbsp;  not currently used
</li><li>      5 &nbsp; &nbsp;  not currently used
</li><li>      6 &nbsp; stressed syllable (the main syllable in stressed words)
</li><li>      7 &nbsp; tonic syllable (by default, the last stressed syllable in the clause)
</li></ul>
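   The sketch below only shows how the eight positions line up with the
   stress levels listed above.  The numbers are placeholders, not recommended
   settings, and the scale of the values is an assumption since it is not
   stated here:
<pre>
      //            0   1   2   3   4   5   6   7
      stressLength  150 140 180 180 0   0   220 240
      // positions 4 and 5 are not currently used; 6 and 7 (stressed and tonic) are given the largest values
</pre>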
<p>
<dt>
<strong>intonation &nbsp;&lt;param1&gt; &lt;param2&gt;</strong><br>
<dd>   (for further development)<br>

      param1 is currently not used.<br>
      param2 can be 0, 1, or 2 (default 0).  It affects how often the pitch
        rises and falls during a clause.

</dl>
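<p>
Putting several of these parameters together, a voice file might look
something like the sketch below.  This is an illustration rather than one of
the supplied voices: the values are arbitrary, and it is assumed that the file
would be saved under <strong>speak-data/voices</strong> with a name of your
choosing (say, the hypothetical "myvoice") and then selected with
<strong>-v myvoice</strong>.
<pre>
      language  english        // should appear first; sets defaults for phonemes, dictionary, etc.
      pitch     80 120         // base pitch in Hz, then the range value
      formant   1  105 100 100 // formant 1 frequency at 105% of default
      echo      100 30         // delay in ms (0 to 250), amplitude (0 to 100)
      flutter   2              // the default value
      roughness 2              // the default value (range 0 to 7)
      words     0              // run words together where appropriate
      replace   V  U           // vowel in 'strut' replaced by that in 'foot'
      replaceWE N  n           // 'fishing' becomes 'fishin'
</pre>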
<hr>

</body>
</html>
